Multivariate mixed-effects ordinal logistic regression models with difference-in-differences estimator of the impact of WORTH Yetu on household hunger and socioeconomic status among OVC caregivers in Tanzania

Background Although most of the livelihood programmes target women, those that involve women and men have been evaluated as though men and women were a single homogenous population, with a mere inclusion of gender as an explanatory variable. This study evaluated the impact of WORTH Yetu (an economic empowerment intervention to improve livelihood outcomes) on household hunger, and household socioeconomic status (SES) among caregivers (both women and men) of orphaned and vulnerable children (OVC) in Tanzania. The study hypothesized that women and men respond to livelihood interventions differently, hence a need for gender-disaggregated impact evaluation of such interventions. Methods This is a secondary analysis of longitudinal data, involving caregivers’ baseline (2016–2019) and follow-up (2019–2020) data from the USAID Kizazi Kipya project in 25 regions of Tanzania. Two dependent variables (ie, outcomes) were assessed; household hunger which was measured using the Household Hunger Scale (HHS), and Socioeconomic Status (SES) using the Principal Component Analysis (PCA). WORTH Yetu, a livelihood intervention implemented by the USAID Kizazi Kipya project was the main independent variable whose impact on the two outcomes was evaluated using multivariate analysis with a multilevel mixed-effects, ordinal logistic regression model with difference-in-differences (DiD) estimator for impact estimation. Results The analysis was based on 497,293 observations from 249,655 caregivers of OVC at baseline, and 247,638 of them at the follow-up survey. In both surveys, 70% were women and 30% were men. Their mean age was 49.3 (±14.5) years at baseline and 52.7 (±14.8) years at the follow-up survey. Caregivers’ membership in WORTH Yetu was 10.1% at the follow-up. 
After adjusting for important confounders there was a significant decline in the severity of household hunger by 46.4% among WORTH Yetu members at the follow-up compared to the situation at the baseline (adjusted Odds Ratio (aOR) = 0.536, 95% Confidence Interval (CI) [0.521, 0.553]). The decline was 45.7% among women (aOR = 0.543 [0.524, 0.563]) and 47.5% among men (aOR = 0.525 [0.497, 0.556]). Regarding SES, WORTH Yetu members were 15.9% more likely to be in higher wealth quintiles at the follow-up compared to the situation at the baseline (aOR = 1.159 [1.128, 1.190]). This impact was 20.8% among women (aOR = 1.208 [1.170, 1.247]) and 4.6% among men (aOR = 1.046 [0.995, 1.101]). Conclusion WORTH Yetu was associated with a significant reduction in household hunger, and a significant increase in household SES among OVC caregivers in Tanzania within an average follow-up period of 1.6 years. The estimated impacts differed significantly by gender, suggesting that women and men responded to the WORTH Yetu intervention differently. This implied that the design, delivery, and evaluation of such programmes should happen in a gender responsive manner, recognising that women and men are not the same with respect to the programmes.


Introduction
Evaluation of programme impacts remains methodologically complex, especially in non-experimental settings [1][2][3]. Experimental designs with randomisation, or randomised controlled trials (RCTs), are universally accepted as the gold standard for gauging programme impacts [1,3]. This status stems from their inherent ability to achieve similarity between treatment and control subjects in terms of both measured and unmeasured characteristics [4][5][6][7]. However, RCTs are expensive to conduct and often raise ethical issues, which limits their applicability [8]. For this reason, non-experimental designs such as longitudinal, cohort, or case-control studies have been recommended as methodological alternatives [9][10][11][12]. In non-experimental settings, however, the similarity of subjects cannot be guaranteed because there is no randomisation, and assignment to treatment or control groups occurs by self-selection, making selection bias a major limitation [1,3]. As a result, estimated impacts from non-experimental programmes can be biased by unmeasured confounders [13][14][15]. Impact evaluation therefore calls for methodological approaches that minimise selection bias as far as possible, by accounting for as many variables (ie, sources of bias) as possible in regression models or by matching, so as to reduce the variance represented by the error term and increase the precision of the estimated programme impact [16]. Further, impact evaluation of non-experimental programmes in different fields may require field-specific approaches to further minimise selection bias and improve the precision of the estimated impacts.
Economic empowerment programmes have evolved substantially over the last few decades, with diverse programming modalities that complicate the evaluation of their impacts [17]. Extant evidence clearly shows that most such programmes target women [18][19][20]. Core explanations for this include the fact that, in many parts of the world, women and girls suffer a substantial share of various forms of discrimination and vulnerability [21]. Also, while women are the ones mostly affected by poverty, they are the key players in food production for their families, especially in low- and middle-income countries (LMICs) [22,23]. Another explanation is that economic empowerment programmes to enhance livelihood outcomes are delivered as part of structural interventions for the prevention of Human Immunodeficiency Virus (HIV), because women are at higher risk for HIV acquisition than men [19]. According to the United Nations Development Programme (UNDP), Tanzania's Gender Inequality Index (GII) for the year 2021 stood at 0.560, placing the country at the 146th position globally [24]. With this GII, which surpassed the world's average of 0.465, Tanzania was classified as having low human development, underscoring the persistent challenge of inequality between women and men in the country [24]. Therefore, recognising that gender inequality propels inequities in multiple dimensions including health and wellbeing [25], the United Nations fifth goal for sustainable development was explicitly set to achieve gender equality and empowerment of all women and girls [26]. Due to this, several empowerment programmes have been implemented in the world, including those addressing food insecurity, hunger, and poverty [27,28].
Considering this background, evidence suggests that in economic empowerment programmes where both genders are involved (eg, Kakuhikire et al. [29]), women and men are likely to respond to the interventions differently. Unfortunately, previous impact evaluations of programmes involving both genders have treated women and men as a single homogenous population, without disaggregating the impacts by gender. Attempts to integrate gender in some evaluations have mainly elucidated the interplay between gender norms and livelihood programming [30,31], or compared women and men in terms of livelihood activities [19,32] and earnings [19]. Gender-responsive impact evaluation of such programmes has remained uncommon, yet it is needed to inform strategies for efficient design, delivery, and evaluation of livelihood interventions for gender-equitable livelihood outcomes.
In view of this, this paper evaluates the impact of WORTH Yetu on two livelihood outcomes, namely household hunger and household socioeconomic status (SES). The study also examines the differences between women and men in the impact of WORTH Yetu on each of the two outcomes, in order to provide scientific evidence on the significance of gender disaggregation in impact evaluation of non-experimental economic empowerment programmes. The analysis is based on WORTH Yetu, an economic empowerment intervention implemented under the USAID Kizazi Kipya project (2016-2021) to improve livelihood outcomes among caregivers of orphans and vulnerable children (OVC) in Tanzania [33]. The project was geographically large-scale and nationally representative, reaching hundreds of thousands of households in Tanzania caring for OVC to improve their health and wellbeing. The analysis of WORTH Yetu impact on livelihood outcomes (ie, household hunger and SES) is intended to go beyond the common computation of frequencies and percentages by gender, or the mere inclusion of gender as a control variable in regression analysis (eg, Embleton et al. [34]). Instead, it handles men and women as separate populations, evaluates programme impacts in each, and then compares, contrasts, and explains their similarities, differences, and implications. This will generate evidence of not only how effective the intervention was, but also inform options and strategies for making such programmes more effective, thereby contributing to the core purposes of impact evaluation: accountability and learning [2].
This evaluation approach is consistent with the Realist Evaluation (RE) theory [35]. The RE theory was designed to enhance the understanding of how different interventions work in different contexts. The theory attempts to elucidate what works, for whom, in which situations, why it does or does not work, and over what duration [36,37]. As such, the theory assumes that no intervention works everywhere for everyone, which is why context is vital. The theory deals with the causal mechanism that enables the programme to work [37]. The theory operates on the configuration C + M = O, whereby C represents the context, M stands for the mechanism, and O represents the outcome. The configuration describes how specific contextual factors work to trigger mechanisms, and how the combination brings about certain outcomes [35].

Study design and settings
The present study is longitudinal in design, involving secondary data from the USAID Kizazi Kipya project. The data are from 81 district councils in 25 regions of Tanzania where the project enrolled beneficiaries (baseline) during 2016-2019, with a follow-up assessment during 2019-2020. Regions in Tanzania where the USAID Kizazi Kipya project was not implemented were excluded from this study as the necessary data were unavailable. The project aimed to scale up the uptake of HIV services, other health services, as well as social services by Tanzanian OVC and their caregivers through a Pact-led consortium of non-governmental organizations (NGOs), Civil Society Organizations (CSOs), and the Government of the United Republic of Tanzania at national, regional, district, and community levels [38]. The project provided services in the areas of health, economic empowerment, education and other social services to OVC, vulnerable youth and their caregivers. At the community level, volunteers known as Community Case Workers (CCWs) and Lead Case Workers (LCWs) supported the implementation of the project by identifying services needed by each enrolled beneficiary, then proceeding to delivery of certain services while providing referrals for other services that the CCWs and LCWs were not mandated to provide directly.

Study population
OVC caregivers aged 18 years or more constituted the study population. The caregivers included were beneficiaries of the USAID Kizazi Kipya project enrolled (ie, baseline) from 24 November 2016 to 30 October 2019 and later reassessed in a follow-up survey from 1 February 2019 to 30 September 2020. From enrollment to the follow-up survey, each caregiver was assessed twice with the FCAA tool. The project defined a caregiver as a parent or guardian who has the greatest (ie, primary) responsibility for caring for one or more OVC in one household [39]. In the enrollment process, one primary caregiver in a household was registered in the project and issued a unique caregiver identification number (CGID). Therefore, the number of caregivers was equal to the number of households enrolled in the project. In the extraction process, 250,668 caregivers had matching CGIDs in the baseline and follow-up datasets from the USAID Kizazi Kipya project database. Of these, 2,323 had a matching CGID in the follow-up dataset but no data on any of the variables, suggesting that they were lost to follow-up (LTFU), leaving 248,345 caregivers with data at follow-up. Upon further exploration of the datasets, 1,013 and 707 caregivers from the baseline and follow-up datasets, respectively, were excluded because they had missing observations in one or more of the variables included in the analysis. This process resulted in 249,655 caregivers at baseline and 247,638 caregivers at follow-up with eligible data for the current analysis. Since this was a longitudinal study, with each caregiver expected to have two observations (ie, one at baseline and another at follow-up), the analysed data were distributed as follows: 247,447 caregivers each observed twice, making up 247,447 × 2 = 494,894 observations; 2,208 caregivers with baseline data only, making up 2,208 observations; and 191 caregivers with follow-up data only, making up 191 observations. Therefore, the final analysis was based on 249,846 caregivers with a total of 497,293 observations representing baseline and follow-up assessments (Fig 1).
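The sample flow reported above can be cross-checked with simple arithmetic. The sketch below only re-derives the counts quoted in this section; the variable names are illustrative and do not come from the project database.

```python
# Counts quoted in the Study population section.
matched = 250_668             # caregivers with a matching CGID in both datasets
lost_to_follow_up = 2_323     # matched CGIDs with no follow-up data
excluded_baseline = 1_013     # baseline records with missing analysis variables
excluded_follow_up = 707      # follow-up records with missing analysis variables

# Eligible records at each survey round.
baseline_n = matched - excluded_baseline
with_follow_up = matched - lost_to_follow_up
follow_up_n = with_follow_up - excluded_follow_up

# Observation pattern of the analysed caregivers.
both_rounds = 247_447         # observed at baseline and at follow-up
baseline_only = 2_208
follow_up_only = 191

caregivers = both_rounds + baseline_only + follow_up_only
observations = 2 * both_rounds + baseline_only + follow_up_only

print(baseline_n, follow_up_n, caregivers, observations)
```

The two derivations agree: caregivers observed twice plus those observed at baseline only give the baseline count, and the same holds for the follow-up count.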

Data source
As highlighted, the USAID Kizazi Kipya was a community-based project implemented in Tanzania for five years, from 2016 to 2021. The primary beneficiaries of the project were OVC and their caregivers. The project provided services to enrolled beneficiaries in several dimensions, including HIV services, other health services, food and nutrition, psychosocial care and support, and economic strengthening (ES).

The WORTH Yetu intervention
From the project, data pertaining to the ES intervention were sought for the purpose of the present study. The ES intervention under the project was intended to ensure that caregivers have the financial resources to meet the needs of the OVC by improving their livelihoods, employment skills, and life skills as a critical pathway towards growth and reduction of their economic vulnerability. All caregivers were eligible for, and hence informed of, the ES intervention under the project, but membership or participation in the intervention was voluntary. The ES intervention was delivered through locally formed WORTH Yetu groups, whose members were required to meet weekly and make mandatory and voluntary savings to create a base for individual loans as well as startup projects for the groups. WORTH Yetu members had access to financial literacy training, an opportunity to save, and access to microcredit from financial institutions and other sources. Through the WORTH Yetu groups, members were enabled to start group projects, such as farming, animal husbandry, and horticulture [40].
The WORTH Yetu groups were facilitated by Livelihoods Volunteers (LVs), a cadre formally recruited at the ward level with support from the respective Ward Executive Officers (WEOs). Their recruitment process involved written and oral interviews as well as vetting by the Local Government Authority (LGA) at the ward level. Each LV supported a maximum of 10 groups with at least 150 members from targeted OVC households. The USAID Kizazi Kipya project provided financial and technical inputs to support LVs to deliver quality services to the WORTH Yetu groups. This required each LV to attend training sessions organized by the CSOs' Economic Strengthening and Livelihood Officers (ESLOs) on key curricula for the effective functioning of WORTH Yetu groups. The trained LV cascaded the training to the management committee and members of each WORTH Yetu group. Each group received two or more visits every month from its LV. The LV was also responsible for coaching groups and facilitating linkages to other ES opportunities. Each LV was also required to meet their respective ESLO every month for training and submission of progress reports on their groups.

Variables
Dependent variables. Two dependent variables (ie, outcomes) were assessed in the present study, namely, the level of household hunger, and household socioeconomic status (SES). Both variables were measured objectively as ordinal variables as described below.
Household hunger. The level of household hunger was determined using the Household Hunger Scale (HHS). The HHS was established by the Food and Agriculture Organization (FAO) and Tufts University through the Food and Nutrition Technical Assistance III Project (FANTA) [41]. The HHS is an improved version of the Household Food Insecurity Access Scale (HFIAS), formed by reducing the HFIAS to three questions after internal and external validations in Africa and Asia [41,42]. The three questions which the HHS uses to determine the level of household hunger are: (1) "In the past 4 weeks, how often was there ever no food to eat of any kind in your household because of lack of resources to get food?", (2) "In the past 4 weeks, how often did any household member go to sleep at night hungry because there was not enough food?", and (3) "In the past 4 weeks, how often did any household member go a whole day and night without eating anything?". Each of the three questions has four possible responses: never, rarely (once or twice), sometimes (3 to 10 times) and often (more than 10 times). According to the HHS, these responses are recoded so that the first category, never, is coded '0'; the next two categories, rarely (once or twice) and sometimes (3 to 10 times), are coded '1'; and the last category, often (more than 10 times), is coded '2'. A row sum of the codes across the three questions is then computed to generate a hunger score for each household. The resulting score ranges from 0 to 6, such that the higher the score, the more severe the level of household hunger. Ultimately, based on the scores, the HHS classifies households into three hunger-defining categories in increasing order of hunger severity: (1) little to no hunger, (2) moderate hunger, and (3) severe hunger [41].
SES. Household SES was assessed using Principal Component Analysis (PCA) of household-owned assets [43]. The household assets involved in the PCA were whether the main dwelling material for the household is concrete, cement, aluminium and/or other materials, and whether the household owns land, chickens, goats, cows, bicycles, tractors, motorcycles, motor vehicles, ovens, and hair dryers. The final SES variable from the PCA was ordinal in structure, with five categories known as wealth quintiles, from the lowest quintile (Q1) for the poorest households to the highest quintile (Q5) for the wealthiest households, as represented below.

$$
\text{Socioeconomic status (SES)} =
\begin{cases}
\text{Q1} & \text{lowest wealth quintile (poorest households)} \\
\text{Q2} & \text{second wealth quintile} \\
\text{Q3} & \text{third wealth quintile} \\
\text{Q4} & \text{fourth wealth quintile} \\
\text{Q5} & \text{highest wealth quintile (wealthiest households)}
\end{cases}
$$
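The HHS scoring steps described above (recode, sum, classify) can be sketched in a few lines. The cut-offs used here (0-1, 2-3, 4-6) are the standard ones from the FANTA HHS guide and are assumed to be those applied in this study, since the section does not state them explicitly. The function names and response labels are illustrative.

```python
# Recode each HHS response: never -> 0, rarely or sometimes -> 1, often -> 2.
RESPONSE_CODES = {"never": 0, "rarely": 1, "sometimes": 1, "often": 2}


def hhs_score(responses):
    """Sum the recoded answers to the three HHS questions (range 0 to 6)."""
    if len(responses) != 3:
        raise ValueError("the HHS uses exactly three questions")
    return sum(RESPONSE_CODES[answer] for answer in responses)


def hhs_category(score):
    """Classify a 0-6 score using the standard FANTA cut-offs.

    Assumption: the study applied these standard cut-offs.
    """
    if not 0 <= score <= 6:
        raise ValueError("HHS score must lie between 0 and 6")
    if score <= 1:
        return "little to no hunger"
    if score <= 3:
        return "moderate hunger"
    return "severe hunger"


# Example household: no food once or twice, slept hungry often, never a full day without food.
example = ["rarely", "often", "never"]
print(hhs_score(example), hhs_category(hhs_score(example)))
```

Because the two middle responses share the same code, the scale deliberately ignores the difference between "rarely" and "sometimes" and separates only "never" and "often" from them.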
Independent variables. WORTH Yetu was the main independent variable of interest for this study. This was the livelihoods intervention programme whose impact on the dependent variables described above was assessed. The variable was binary, representing whether the caregiver was a member ('1') or not a member ('0') of WORTH Yetu between baseline and follow-up periods of assessment. Gender was another main independent variable which was time-independent, recognizing the caregiver as either a man or a woman based on their self-identification during enrollment.
Other independent variables included were age in groups of 18-29 years, 30-39 years, 40-49 years, 50-59 years, and 60+ years. Level of formal education attained was also included amongst the independent variables (never attended, primary, and secondary or more), as well as marital status (married or living together, divorced or separated, never married, and widow or widower), family size (2-3 people, 4-6 people, and 7+ people), whether one or more family members has health insurance (yes, and no), HIV status (negative, positive, and unknown or undisclosed), place of residence (rural and urban), and whether the caregiver was physically or mentally disabled (yes, and no). The disability was assessed at enrollment based on physically observable conditions and limitations of the caregiver, such as blindness, physical disability etc. as described elsewhere [44]. The source of data and all variables used for this study is the baseline and follow-up surveys conducted among OVC caregivers of the USAID Kizazi Kipya project in Tanzania.

Data analysis
Both descriptive and inferential statistical techniques were applied in the current study. In the descriptive part, the frequency distribution of the respondents was computed through one-way tabulations of each of the variables at baseline and follow-up. This was followed by two-way tabulations of each of the outcomes by WORTH Yetu and each of the described independent variables, with a Chi-square (χ²) test to gauge the degree of association between them.
In the inferential analysis, multivariate analysis to evaluate the impact of WORTH Yetu on both household hunger and SES was conducted using a multilevel mixed-effects ordinal logistic regression model with a DiD estimator, fitted with Stata's "meologit" command. The model operates on the condition that the numerical values representing the categories of each outcome are not relevant, except that larger values correspond to higher outcomes. The choice of this model was motivated by the consideration that both outcomes were inherently ordinal variables, with the underlying assumption being that the three categories of household hunger are in increasing order of hunger severity, and the five categories of SES (ie, wealth quintiles) are in order of increasing socioeconomic wellbeing. In both cases, the categories have a natural ordering, but the distances between adjacent categories of each variable are unknown [45].
Also, since the data were longitudinal, we assumed that observations of the same caregiver are correlated. A two-level multilevel model with a random intercept was therefore defined, whereby observations (level 1) were nested within caregivers (level 2). In the analysis, a full model was fitted for all observations of the caregivers, after which separate models, one for women's observations and another for men's observations, were fitted. In all models, WORTH Yetu impact on each of the outcomes was evaluated using the DiD estimator through an interaction term between WORTH Yetu ('0' = non-member, and '1' = member) and time ('0' = baseline, and '1' = follow-up), controlling for potential confounders. The full model was used to gauge the overall WORTH Yetu impact on each of the outcomes, as well as the significance of gender as an indicator of how similarly or differently women and men responded to the WORTH Yetu intervention. The purpose of the separate models (for women and men) was two-fold: (1) to compare the magnitude of the impact of WORTH Yetu on household hunger and SES between men and women, and (2) to compare the extent to which other factors that influence household hunger and SES are similar or different between men and women.
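To illustrate why the interaction term carries the DiD estimate, the sketch below evaluates the fixed part of a linear predictor in the four membership-by-time cells. The coefficient values are hypothetical and chosen only for illustration; they are not estimates from this study, and control variables and the random intercept are omitted because they cancel in the double difference.

```python
import numpy as np

# Hypothetical log-odds coefficients (illustration only, not study estimates):
# [WORTH Yetu member, follow-up time, member x time interaction]
beta = np.array([0.10, -0.40, -0.22])


def linear_predictor(member, time):
    """Fixed part of the linear predictor for one membership/time cell."""
    x = np.array([member, time, member * time], dtype=float)
    return float(x @ beta)


# Change from baseline to follow-up within each group (log-odds scale).
change_members = linear_predictor(1, 1) - linear_predictor(1, 0)
change_non_members = linear_predictor(0, 1) - linear_predictor(0, 0)

# The difference between the two changes equals the interaction coefficient.
did = change_members - change_non_members
print(did, np.exp(did))  # DiD on the log-odds scale and as a ratio of odds ratios
```

Under this coding, the time coefficient describes the change among non-members, and the sum of the time and interaction coefficients describes the change among members. How the adjusted odds ratios reported in the results map onto these terms is not spelled out in this section.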
The basic form of a two-level multilevel model for an ordinal outcome variable with a random intercept can be described as follows. Given an ordinal outcome variable such as SES, the basic conception is that behind the observed ordinal variable there exists an underlying latent continuous variable that is not measured directly [46]. Denoted as $Y^*_{ij}$ for observation $i$ of caregiver $j$, the latent continuous outcome variable can be represented as follows in the context of the current study:

$$Y^*_{ij} = \eta_{ij} + \varepsilon_{ij}, \qquad \eta_{ij} = \mathbf{x}'_{ij}\boldsymbol{\beta} + u_{0j},$$

where $\mathbf{x}_{ij}$ contains WORTH Yetu membership, time, their interaction (the DiD term) and the other observed factors, $\boldsymbol{\beta}$ is the vector of fixed-effect coefficients, $u_{0j}$ is the caregiver-level random intercept, and $\varepsilon_{ij}$ is the level 1 residual.

To link $Y^*_{ij}$ and the observed ordinal outcome $Y_{ij}$, a threshold model is defined. For ordinal categories $c = 1, 2, 3, \ldots, C$, the threshold model can be represented as

$$Y_{ij} = c \quad \text{if} \quad \kappa_{c-1} < Y^*_{ij} \le \kappa_c, \qquad \kappa_0 = -\infty, \quad \kappa_C = +\infty,$$

where $\kappa_c$ is a threshold parameter, and the thresholds are in increasing order, such that $\kappa_1 < \kappa_2 < \kappa_3 < \ldots < \kappa_{C-1}$. When $Y^*_{ij}$ increases past a given threshold, there is a discrete jump in the observed ordered categories of $Y_{ij}$. For example, when $Y^*_{ij}$ exceeds the threshold $\kappa_1$, $Y_{ij}$ changes from 1 to 2; when $Y^*_{ij}$ exceeds the threshold $\kappa_2$, $Y_{ij}$ changes from 2 to 3, etc.

The random effects at level 2 are assumed to be normally distributed, such that $u_{0j} \sim N(0, \sigma^2_u)$ for all caregivers. For level 1 residuals, and considering the logit specification, $\varepsilon_{ij} \sim \text{logistic}(0, \pi^2/3)$ for all observations, leading to a multilevel cumulative logit model as described by Bauer and Sterba (2011) [46]. The random parameters are independent of one another, ie, $\text{Cov}(u_{0j}, \varepsilon_{ij}) = 0$.

In the framework of generalized linear models, the same cumulative multilevel logit model is expressed as

$$\Pr(Y_{ij} \le c) = f^{-1}(\kappa_c - \eta_{ij}),$$

where $\Pr(Y_{ij} \le c)$ is the cumulative probability that a response of the ordinal outcome variable will be recorded in category $c$ or below, and $\eta_{ij}$ is a linear predictor constituting a linear combination of WORTH Yetu and other observed factors and random effects. Again, $\kappa_c$ is a threshold parameter, and $f^{-1}(\cdot)$ is the inverse link function that maps the continuous quantity $[\kappa_c - \eta_{ij}]$ onto predicted values bounded by the asymptotes 0 and 1 [47,48].
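As a minimal sketch of the cumulative logit model just described, the function below turns a linear predictor and a set of ordered thresholds into category probabilities. The threshold and predictor values are hypothetical and chosen only to show the mechanics. Function and argument names are illustrative.

```python
import numpy as np


def category_probabilities(eta, thresholds):
    """Pr(Y = c) for c = 1..C under a cumulative logit model.

    eta        : linear predictor (fixed part plus random intercept)
    thresholds : strictly increasing cut-points kappa_1 < ... < kappa_{C-1}
    """
    kappa = np.asarray(thresholds, dtype=float)
    if np.any(np.diff(kappa) <= 0):
        raise ValueError("thresholds must be strictly increasing")
    # Inverse logit link: Pr(Y <= c) = 1 / (1 + exp(-(kappa_c - eta))).
    cumulative = 1.0 / (1.0 + np.exp(-(kappa - eta)))
    # Add the boundary values Pr(Y <= 0) = 0 and Pr(Y <= C) = 1, then difference.
    cumulative = np.concatenate(([0.0], cumulative, [1.0]))
    return np.diff(cumulative)


# Three ordered categories (as for household hunger) need two thresholds.
probs = category_probabilities(eta=0.5, thresholds=[-1.0, 2.0])
print(probs, probs.sum())
```

Raising the linear predictor shifts probability mass towards higher categories, which is why an odds ratio below 1 for household hunger corresponds to less severe hunger.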

Ethics approval and consent to participate
This study received ethics approval from the Institutional Research Review Ethics Committee (IRREC) of the University of Dodoma in Tanzania (MA.84/261/61/57). The data had prior ethics approval from the Medical Research Coordinating Committee (MRCC) of the National Institute for Medical Research (NIMR) in Tanzania, with ethics clearance certificate number NIMR/HQ/R.8a/Vol.IX/3024, also described elsewhere [39]. The data represent beneficiaries of the USAID Kizazi Kipya project whose households were enrolled in the project voluntarily. The screening and enrollment form included a section where caregivers who consented to participate in the project signed as evidence that they had been informed about the project, and that they were voluntarily willing to participate. Datasets provided for this study were anonymous, securely stored, and only accessible to the authors.

Profile of respondents
The present analysis was based on observations from 249,655 caregivers at the baseline, and 247,638 of them at the follow-up survey. By gender, 70.0% of the caregivers were women and the remaining 30.0% were men. Overall, their mean age was 49.3 (±14.5) years at baseline and 52.7 (±14.8) years at the follow-up survey. These values differed by gender, as women were relatively younger than men. At the baseline, women's mean age was 48.0 (±14.4) years and men's was 52.3 (±14.3) years; at the follow-up survey, the corresponding values were 51.4 (±14.7) years for women and 55.7 (±14.5) years for men. The differences in mean age between women and men at baseline and at follow-up were statistically significant (p < 0.001).
At the time of the follow-up survey, 10.1% of all the caregivers analysed were members of, or participants in, WORTH Yetu. Since WORTH Yetu was a livelihoods programme supported by the USAID Kizazi Kipya project, membership was 0.0% at the baseline because there were no project services before enrollment. Further details regarding other background characteristics of the caregivers at the baseline and at the follow-up surveys are presented in Table 1, and disaggregation of the same characteristics by gender is presented as supporting information in S1 Table.

WORTH Yetu members' and non-members' characteristics at baseline
Table 2 shows the baseline characteristics of the OVC caregivers who were members and non-members of WORTH Yetu. While members and non-members of WORTH Yetu were similar in some baseline characteristics, the results revealed notably large differences in most characteristics, including place of residence, family size, and age. These baseline differences are consistent with the selection bias inherent in programmes that are non-experimental by design [3].

Levels of household hunger at baseline and at follow-up
There was a significant change (p < 0.001) in levels of household hunger between baseline and follow-up surveys. As shown in Table 3, households with little to no hunger (food secure) increased from 25.7% (25.2% women and 27.0% men) at baseline to 31.3% (30.8% women and 32.6% men) at the follow-up survey; moderate hunger declined negligibly from 65.5% (66.1% women and 63.9% men) at baseline to 65.4% (65.8% women and 64.4% men) at the follow-up; and severe hunger declined from 8.8% (8.7% women and 9.1% men) at baseline to 3.3% (3.4% women and 3.0% men) at the follow-up survey. By gender, the observed improvements in the levels of household hunger appeared to be larger among men than among women, especially at the follow-up survey.

Levels of household hunger at follow-up by WORTH Yetu membership status
Overall, 31.3%, 65.4%, and 3.3% of the caregivers were in households with little to no hunger (food secure), moderate hunger, and severe hunger, respectively, at the follow-up survey. These percentages differed significantly by WORTH Yetu membership status (p < 0.001). The percentage of caregivers in households with little to no hunger was 30.5% (30.0% women and 31.8% men) among non-members compared with 38.4% (37.9% women and 39.6% men) among WORTH Yetu members; moderate hunger was 66.1% (66.6% women and 65.2% men) among non-members compared with 58.5% (59.1% women and 57.1% men) among WORTH Yetu members; and severe hunger was 3.4% (3.5% women and 3.0% men) among non-members compared with 3.1% (2.9% women and 3.4% men) among WORTH Yetu members (Table 4).

SES at baseline and at follow-up
There was a significant change in SES between baseline and follow-up surveys overall and for both women and men (p < 0.001). Briefly, caregivers in the lowest wealth quintile declined from 34.8% (39.0% women and 25.2% men) at baseline to 30.8% (34.2% women and 22.9% men) at the follow-up survey. The share in the richest wealth quintile did not change and remained at 19.6%, but differences between women and men existed: 15.4% among women and 29.2% among men at baseline, and 15.9% among women and 28.4% among men at the follow-up survey. More details about changes in SES between baseline and follow-up are presented in Table 5.

Results of multivariate analysis
Impact of WORTH Yetu on household hunger. In the multivariate analysis (Table 7), after adjusting for other factors, namely sex, education, marital status, age, health insurance, place of residence, disability status, and family size, the study found a significant decline in the severity of household hunger of 33.3% among non-members of WORTH Yetu, and a larger decline of 46.4% among WORTH Yetu members, at the follow-up compared to the situation at the baseline (non-members at follow-up: aOR = 0.667, 95% CI [0.659, 0.676]; WORTH Yetu members at follow-up: aOR = 0.536, 95% CI [0.521, 0.553]) (Table 7, Model 1).
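The percentages quoted alongside the adjusted odds ratios follow from the usual conversion (aOR − 1) × 100. The short sketch below reproduces that arithmetic for the two estimates above. Strictly, the result is a percent change in the odds of being in a more severe hunger category, not in the probability. The function name is illustrative.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio: (OR - 1) * 100."""
    if odds_ratio <= 0:
        raise ValueError("an odds ratio must be positive")
    return (odds_ratio - 1.0) * 100.0


# Adjusted odds ratios reported for household hunger (Table 7, Model 1).
reported = {
    "non-members at follow-up": 0.667,
    "WORTH Yetu members at follow-up": 0.536,
}

for label, aor in reported.items():
    print(f"{label}: {percent_change_in_odds(aor):+.1f}% change in odds")
```

The same conversion applies to the SES results, where odds ratios above 1 give positive percentages.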

Table 3. Percent and corresponding 95% confidence interval (CI) of OVC caregivers in different levels of household hunger at baseline and at follow-up, disaggregated by gender.
Most of the other factors which influenced the level of household hunger were similar by gender, except age, whereby the likelihood to experience the more severe forms of household hunger declined in a dose-response fashion in all age groups above 29 years for women, but for men, the decline was significant only in age groups above 39 years.This suggested that the protective effect of age against household hunger was not equally felt between women and men, implying that women were more likely to be food secure at a younger age than men (Table 7, Models 2 and 3).
The intraclass correlation coefficient (ICC) for each of Models 1, 2, and 3 in Table 7 was 20%, representing the amount of correlation between observations of the same caregiver.For each of the models, the p-value from the likelihood-ratio (LR) test that a variance component was zero was <0.001, emphasizing that fitting the regression models while recognizing the clustering of observations within caregivers was statistically more appropriate than fitting the standard models.
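For a random-intercept logit model, the ICC on the latent scale is commonly computed as σ²_u / (σ²_u + π²/3), using the logistic residual variance given in the model description. The sketch below assumes the reported ICC of 20% is this latent-scale quantity and back-calculates the implied random-intercept variance. That variance is an illustration, not a figure reported in the study.

```python
import math

# Level 1 residual variance of the standard logistic distribution.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def icc_from_variance(sigma2_u):
    """Latent-scale ICC for a random-intercept logit model."""
    return sigma2_u / (sigma2_u + LOGISTIC_RESIDUAL_VARIANCE)


def variance_from_icc(icc):
    """Random-intercept variance implied by a latent-scale ICC (0 <= icc < 1)."""
    if not 0 <= icc < 1:
        raise ValueError("ICC must lie in [0, 1)")
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1.0 - icc)


# Assumption: the reported 20% is the latent-scale ICC.
sigma2_u = variance_from_icc(0.20)
print(round(sigma2_u, 3), round(icc_from_variance(sigma2_u), 2))
```

Under that assumption, an ICC of 20% means about one fifth of the latent-scale variance in household hunger lies between caregivers rather than between a caregiver's two observations.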

Impact of WORTH Yetu on SES
Table 8 presents the impact of WORTH Yetu on SES, after addressing selection bias in terms of gender, age, marital status, education, HIV status, place of residence, health insurance, disability status, and family size.
Results (Table 8, Model 1) reveal that non-members of WORTH Yetu were 14.9% less likely to be in higher wealth quintiles at the follow-up (aOR = 0.851, 95% CI [0.842, 0.861]), while WORTH Yetu members were 15.9% more likely to be in higher wealth quintiles at the follow-up compared to the situation at the baseline (aOR = 1.159, 95% CI [1.128, 1.190]).
In the disaggregated analysis, women who were not in WORTH Yetu were 12.6% less likely to be in higher wealth quintiles at the follow-up (aOR = 0.874, 95% CI [0.862, 0.886]), while women who were WORTH Yetu members were 20.8% more likely to be in higher wealth quintiles at the follow-up (aOR = 1.208, 95% CI [1.170, 1.247]) compared to the situation at the baseline (Table 8, Model 2). For men (Table 8, Model 3), non-members of WORTH Yetu were 18.9% less likely to be in higher wealth quintiles (aOR = 0.811, 95% CI [0.795, 0.828]), while men who were WORTH Yetu members were 4.6% more likely to be in higher wealth quintiles at the follow-up (aOR = 1.046, 95% CI [0.995, 1.101]) compared to the situation at the baseline. This effect was not statistically significant at the 5% level but indicates that the WORTH Yetu intervention was protective against the loss of household assets.
The ICC was 27% for each of Model 1 and Model 2 in Table 8, and 26% for Model 3. Again, this indicated the degree of correlation of observations of the same caregiver, favouring the use of multilevel models which account for within-cluster correlations over standard models [49]. The LR test indicated that the variance component in each of the models was not zero (p < 0.001), hence the multilevel models were appropriately used in this case over standard models.
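The significance statements above can be checked approximately from the reported confidence intervals alone. The sketch below assumes the intervals are Wald intervals that are symmetric on the log-odds scale, which the text does not confirm. It recovers the standard error from the interval width and computes a two-sided normal p-value. The function name is illustrative, and because the inputs are rounded to three decimals the results are only approximate.

```python
import math


def p_value_from_ci(odds_ratio, ci_lower, ci_upper, z_crit=1.959964):
    """Approximate z statistic and two-sided p-value from an OR and its 95% CI.

    Assumes a Wald interval symmetric on the log-odds scale (an assumption).
    """
    log_or = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Member estimates quoted in the text for Table 8, Models 2 and 3.
z_women, p_women = p_value_from_ci(1.208, 1.170, 1.247)
z_men, p_men = p_value_from_ci(1.046, 0.995, 1.101)

print(f"women: z = {z_women:.2f}, p = {p_women:.3g}")
print(f"men:   z = {z_men:.2f}, p = {p_men:.3g}")
```

For men, this approximation gives a p-value above 0.05, in line with the interval including 1. For women, the p-value is far below 0.001.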

Discussion
This study investigated the significance of gender disaggregation in impact evaluation of nonexperimental livelihood interventions, based on the analysis of WORTH Yetu impact on household hunger and SES among OVC caregivers in Tanzania. For each of the two outcomes, a multivariate model was fitted for all the caregivers, followed by two separate models, one for women and another for men. In each of the models, the potential confounders controlled for to account for selection bias were age, marital status, education attained, health insurance, HIV status, place of residence, disability status, and family size. After adjusting for these factors, the overall findings revealed a significant decline in household hunger by 46.4% among WORTH Yetu members at the follow-up compared to the situation at the baseline (aOR = 0.536, 95% CI [0.521, 0.553], p < 0.001). In the gender-disaggregated models (ie, within-gender comparisons), the decline was 45.7% among women who were members of WORTH Yetu (aOR = 0.543, 95% CI [0.524, 0.563], p < 0.001) and 47.5% among men who were members of WORTH Yetu (aOR = 0.525, 95% CI [0.497, 0.556], p < 0.001) at the follow-up compared to their respective situations at the baseline. These findings are consistent with those from experimental programmes such as Chuma na Uchizi, a livelihood intervention that reduced food insecurity among PLHIV in Zambia [50].
Regarding household SES, the odds of being in higher wealth quintiles were significantly higher (1.159 times) among WORTH Yetu members at the follow-up compared to the situation at the baseline (aOR = 1.159, 95% CI [1.128, 1.190], p < 0.001). After disaggregating the analysis by gender (ie, within-gender comparisons), the odds of being in higher wealth quintiles were significantly higher (1.208 times) among women who were WORTH Yetu members (aOR = 1.208, 95% CI [1.170, 1.247], p < 0.001), and 1.046 times higher among men who were WORTH Yetu members (aOR = 1.046, 95% CI [0.995, 1.101], p = 0.080), at the follow-up compared to their respective situations at the baseline. The intervention's impact among men was positive, but not statistically significant. Of note is that, without WORTH Yetu (ie, among intervention non-recipients), the likelihood of being in higher wealth quintiles declined significantly over time in the overall model (p < 0.001), as well as in the women's (p < 0.001) and men's (p < 0.001) models. These findings suggested that while the WORTH Yetu intervention facilitated household asset acquisition among members overall, and more so among women, among men it protected against household asset loss over time (with no significant evidence of improved asset acquisition).
Between-gender comparison showed that men and women differed significantly with respect to the impact of WORTH Yetu on both outcomes (household hunger and SES). Specifically, men were significantly more likely than women to be in more severe forms of household hunger at the follow-up compared to the baseline situation. However, men were more likely than women to be in higher wealth quintiles at the follow-up survey compared to the situation at the baseline.
In addition, other factors, apart from the WORTH Yetu intervention, which influenced both household hunger and SES outcomes were not exactly the same for men and women. Many factors exerted a similar influence on women and men for both outcomes, but some were stronger for one gender than the other. For example, the likelihood of being in more severe forms of household hunger declined as age increased for all age groups above 29 years for women, but this was not the case until after age 39 years for men. In other words, age was a protective factor against household hunger, but the protection was stronger and felt at younger ages in women than in men. Although the overall relationship of age and household hunger observed in the present study is consistent with other studies [51][52][53][54], a positive association between female gender and food security has been noted in some studies, eg, [55]. Therefore, interventions aiming at addressing household hunger in vulnerable populations such as OVC caregivers should be designed in a gender-responsive manner, recognising that men may require additional support and strategies to optimize programme impacts.
Also, the influence of HIV status on SES was very different between women and men. Overall, caregivers LHIV were significantly less likely to be in higher wealth quintiles at the follow-up than those who were HIV negative. Those of unknown HIV status were similar to those who were HIV negative. While this observation was similar to that among women, for men, those who were HIV positive were significantly more likely to be in higher wealth quintiles than their HIV negative counterparts, and those of unknown HIV status were significantly less likely to be in higher wealth quintiles than those who were HIV negative. Although the underlying mechanism of these results is not clear, strategies to improve the outcomes among the intervention recipients should be tailored to their HIV status, so that those at a low chance of benefiting from the intervention are given more support as needed. Other factors, namely, marital status, education, place of residence, disability status, and health insurance were discussed in the earlier study largely based on the same population [40].
All these differences between women and men emphasise that the genders are different, and indeed highlight the significance of gender disaggregation in impact evaluation of nonexperimental livelihood or economic empowerment interventions. These disparities may, in part, stem from differences in access to resources including education, employment opportunities, and control over income. Additionally, traditional gender roles, which allocate varied responsibilities such as household chores for women and breadwinning for men within households and communities [56], contribute to shaping how individuals experience and respond to interventions. This is in line with the Realist Evaluation theory, which posits that no intervention works everywhere and for everyone, which is why the focus should be to find what works, for whom, and why it does or does not work [35].

Strengths and limitations
This study is based on a large sample with nationwide geographical coverage, permitting the results to be nationally representative. Also, the statistical methods employed in the evaluation of the impact of WORTH Yetu on household hunger and SES are scientifically rigorous and addressed selection bias based on a wide range of potential confounders adjusted for in the multivariate models. This guarantees that the estimated impacts are as close to reality as possible, leaving a minimal possibility that the findings are due to chance or confounding.
Although many factors were adjusted for to address selection bias in this study, we acknowledge a possibility of residual confounding, especially due to factors which were not available for inclusion in the analysis.

Conclusion
The present study found that WORTH Yetu reduced household hunger on one hand, and improved household SES on the other, with significant variations in the observed impacts between women and men. WORTH Yetu reduced the likelihood of being in more severe forms of household hunger by 46.4% overall, and by 45.7% for women and 47.5% for men, within the average follow-up period of 1.6 years from the baseline to the follow-up survey. With respect to SES, WORTH Yetu improved the likelihood of being in higher wealth quintiles by 15.9% overall, and by 20.8% for women and only 4.6% for men, within the same period.
Between-gender comparisons emphasised that while men were significantly more likely than women to experience severe forms of household hunger, men were also more likely than women to be in higher wealth quintiles.
For SES, findings clearly suggest that the WORTH Yetu intervention was significantly effective in improving household SES, particularly for women. However, for men, the WORTH Yetu impact on SES was positive, but not statistically significant. A common observation in all three models is that non-members of WORTH Yetu were significantly less likely to improve their SES at the follow-up compared to the situation at the baseline. This partly suggested that even if WORTH Yetu did not significantly improve SES for men, it was at least protective against further loss of household assets over time.
Without gender disaggregation, the observed differences between women and men with respect to the WORTH Yetu impact on household hunger and SES would not have been detected. Many similar studies have simply included gender or sex as one of the explanatory variables, eg, [34]. Had we done the same, we would have ended up only with the finding that the odds of being in higher wealth quintiles were 1.536 times higher for men compared to women. But the deeper analysis uncovered further that women and men were significantly different with respect to the WORTH Yetu impact on SES, whereby the gain in SES due to the WORTH Yetu intervention was significantly higher among women (when comparing women to women) than men (when comparing men to men).
Overall, women and men experienced the livelihood outcomes attributable to the WORTH Yetu intervention differently, highlighting the distinct nature of these populations in the context of economic empowerment programmes. These findings emphasize the importance of prioritising gender as a critical dimension in the design, delivery, and evaluation of livelihood programmes. Moreover, accelerating the coverage of the WORTH Yetu intervention is essential as a viable strategy to combat household hunger and enhance the socioeconomic wellbeing of families caring for OVC and other vulnerable populations. This may require strategies that are responsive to gender-specific needs and differences to maximise the gains of the interventions, eg, providing targeted support to female caregivers LHIV and male caregivers of undisclosed HIV status to enhance their SES. These results can likely be applied in similar contexts and settings to appropriately gauge impacts of similar programmes.

Fig 1 is the flow diagram of the number of caregivers and their baseline and follow-up observations included in the analysis. In the extraction process, 250,668 caregivers had matching CGIDs in the baseline and follow-up datasets from the USAID Kizazi Kipya project database. Although 2,323 of these had matching CGIDs in the follow-up dataset, they had no data on any of the variables, suggesting that they were lost to follow-up (LTFU), leaving 248,345 caregivers at baseline with data at follow-up. Upon further exploration of the datasets, 1,013 and 707 caregivers from the baseline and follow-up datasets, respectively, were excluded because they had missing observations in one or more of the variables included in the analysis. This process resulted in 249,655 caregivers at baseline and 247,638 caregivers at the follow-up with eligible data for the current analysis. Since this was a longitudinal study, with each caregiver expected

Fig 1. Flow diagram of the number of OVC caregivers (n) and their observations (m) included in the analysis at baseline and follow-up. Notes for Fig 1: n represents the number of caregivers, and m represents the number of caregivers' observations. m = 2n: each caregiver has two observations, one at the baseline and another at the follow-up. m = n: each caregiver has one observation only, either at the baseline or at the follow-up. https://doi.org/10.1371/journal.pone.0301578.g001
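The counts described for Fig 1 can be reconciled with a few subtractions. The sketch below is our reading of the exclusion steps; the order in which the exclusions are applied is an assumption, and the variable names are illustrative. The total is compared against the 497,293 observations reported in the abstract.

```python
# Counts quoted in the description of Fig 1.
matched_caregivers = 250_668   # matching CGID in baseline and follow-up datasets
lost_to_follow_up = 2_323      # matched, but no follow-up data on any variable
excluded_baseline = 1_013      # missing values in the baseline dataset
excluded_follow_up = 707       # missing values in the follow-up dataset

# Assumed order of exclusions (not stated explicitly in the text).
baseline_n = matched_caregivers - excluded_baseline
follow_up_with_data = matched_caregivers - lost_to_follow_up
follow_up_n = follow_up_with_data - excluded_follow_up
total_observations = baseline_n + follow_up_n

print(f"baseline caregivers:        {baseline_n:,}")
print(f"follow-up with data:        {follow_up_with_data:,}")
print(f"follow-up caregivers:       {follow_up_n:,}")
print(f"total observations (m):     {total_observations:,}")
```

Under this reading, the derived figures agree with the 249,655 baseline caregivers, the 247,638 follow-up caregivers, and the overall observation count given in the paper.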

Table 6 compares SES at the follow-up survey between WORTH Yetu members and non-members, for both women and men. Findings show that the lowest wealth quintile declined from 32.4% among non-members to 16.5% among WORTH Yetu members and the richest wealth quintile increased from 18.9% (15.1% women and 27.6% men) among non-members to

Table 2. Baseline characteristics of OVC caregivers who were members and non-members of WORTH Yetu at the follow-up.
4% women and 35.7% men) among WORTH Yetu members at the follow-up survey. The other wealth quintiles (ie, second, middle, and fourth) changed positively in favour of WORTH Yetu members with differences between women and men.